Abstract: Big data refers to the
large spectrum of data that is being
created day by day. In recent years, handling this data has become one of the biggest challenges. Hadoop
is an open-source platform that is used effectively to handle big data
applications. The two core components of Hadoop are
MapReduce and the Hadoop
Distributed File System (HDFS). HDFS is the storage mechanism and MapReduce is
the programming model. Results can be produced faster than with traditional
database operations. Pig and Hive are two languages
that help us program the MapReduce framework
within a short period of time.
Keywords: MapReduce, Pig, Hive, Big data, Hadoop, HDFS.
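
To make the MapReduce programming model named in the abstract concrete, the following is a minimal single-machine sketch of a word count in plain Python. It does not use Hadoop or any Hadoop API; the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical and chosen only for illustration. In a real cluster the input would be read from HDFS and the phases would run in parallel across nodes.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group emitted values by key, as happens between the map and reduce steps."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence on an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data", "Big Hadoop"]))
```

Higher-level languages such as Pig and Hive let the same kind of aggregation be written in a few lines, with the framework generating the map and reduce steps.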